KFU at CLEF eHealth 2017 Task 1: ICD-10 Coding of English Death Certificates with Recurrent Neural Networks
Abstract
This paper describes the participation of the KFU team in the CLEF eHealth 2017 challenge. Specifically, we participated in Task 1, “Multilingual Information Extraction ICD-10 coding”, for which we implemented recurrent neural networks to automatically assign ICD-10 codes to fragments of death certificates written in English. Our system uses a Long Short-Term Memory (LSTM) network to map the input sequence into a vector representation, and then another LSTM to decode the target sequence from that vector. We initialize the input representations with word embeddings trained on user posts in social media. The encoder-decoder model obtained an F-measure of 85.01% on the full test set, a significant improvement over the average score of 62.2% across all participants’ approaches. On an external test set, it obtained 44.33%, a significant improvement over the 26.1% average of the submitted runs.
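The encoder-decoder idea in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the weights are random and untrained, the embeddings are random rather than pretrained on social media posts, greedy decoding is an assumption (the abstract does not state the decoding strategy), and the vocabularies, sizes, and helper names (`LSTMCell`, `encode`, `decode`) are hypothetical. It only shows how one LSTM compresses a certificate fragment into a state and a second LSTM emits a code sequence from it.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def init_matrix(rows, cols, rng):
    # Small random weights; a real model would learn these during training.
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

class LSTMCell:
    """A single LSTM cell: input (i), forget (f), output (o) gates and candidate (g)."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        # One weight matrix per gate, applied to the concatenation [x; h_prev].
        self.w = {k: init_matrix(hidden_size, input_size + hidden_size, rng) for k in "ifog"}
        self.b = {k: [0.0] * hidden_size for k in "ifog"}

    def step(self, x, h, c):
        z = x + h  # list concatenation of input and previous hidden state
        pre = {k: [a + b for a, b in zip(matvec(self.w[k], z), self.b[k])] for k in "ifog"}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        cand = [math.tanh(v) for v in pre["g"]]
        c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, cand)]
        h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
        return h_new, c_new

def encode(cell, embeddings, token_ids):
    # Read the input tokens in order; the final (h, c) is the vector representation.
    h = [0.0] * cell.hidden_size
    c = [0.0] * cell.hidden_size
    for token_id in token_ids:
        h, c = cell.step(embeddings[token_id], h, c)
    return h, c

def decode(cell, embeddings, out_proj, state, start_id, end_id, max_len=5):
    # Greedy decoding (an assumption): feed back the highest-scoring symbol each step.
    h, c = state
    prev, output = start_id, []
    for _ in range(max_len):
        h, c = cell.step(embeddings[prev], h, c)
        scores = matvec(out_proj, h)
        prev = max(range(len(scores)), key=scores.__getitem__)
        if prev == end_id:
            break
        output.append(prev)
    return output

rng = random.Random(0)
EMB, HID = 8, 16  # hypothetical sizes
source_vocab = {"acute": 0, "myocardial": 1, "infarction": 2}
target_vocab = ["<s>", "</s>", "I21", "I219", "J189"]  # illustrative labels only

src_emb = init_matrix(len(source_vocab), EMB, rng)
tgt_emb = init_matrix(len(target_vocab), EMB, rng)
encoder = LSTMCell(EMB, HID, rng)
decoder = LSTMCell(EMB, HID, rng)
out_proj = init_matrix(len(target_vocab), HID, rng)

tokens = [source_vocab[w] for w in "acute myocardial infarction".split()]
state = encode(encoder, src_emb, tokens)
predicted = decode(decoder, tgt_emb, out_proj, state, start_id=0, end_id=1)
codes = [target_vocab[i] for i in predicted]
print(codes)  # arbitrary output, since the weights are untrained
```

Because nothing is trained here, the printed codes carry no meaning; the sketch only makes the data flow concrete. In a trained system the encoder embeddings would be initialized from pretrained vectors and all weights fitted on coded certificates.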
Similar resources
An Encoder-Decoder Model for ICD-10 Coding of Death Certificates
Information extraction from textual documents such as hospital records and health-related user discussions has become a topic of intense interest. The task of medical concept coding is to map a variable-length text to medical concepts and corresponding classification codes in some external system or ontology. In this work, we utilize recurrent neural networks to automatically assign ICD-10 codes...
CLEF eHealth 2017 Multilingual Information Extraction task Overview: ICD10 Coding of Death Certificates in English and French
This paper reports on Task 1 of the 2017 CLEF eHealth evaluation lab, which extended the previous information extraction tasks of ShARe/CLEF eHealth evaluation labs. The task continued with coding of death certificates, as introduced in CLEF eHealth 2016. This large-scale classification task consisted of extracting causes of death as coded in the International Classification of Diseases, tenth re...
Multi-lingual ICD-10 Coding using a Hybrid rule-based and Supervised Classification Approach at CLEF eHealth 2017
In this paper, we present our research efforts and the results obtained in the CLEF eHealth challenge 2017, Track 1. The task involves the recognition and mapping of ICD-10 codes to English and French death certificates. Our approach proposes a two-tier, two-stage process. First, we use a rule-based system, based on handcrafted rules and the use of Apache Solr, to perform ICD-10 code Named Entit...
ICD10 Coding of Death Certificates with the NCBO and SIFR Annotator(s) at CLEF eHealth 2017 Task 1
The SIFR BioPortal is an open platform to host French biomedical ontologies and terminologies based on the technology developed by the US National Center for Biomedical Ontology (NCBO). The portal facilitates the use and fostering of terminologies and ontologies by offering a set of services including semantic annotation. The SIFR Annotator (http://bioportal.lirmm.fr/annotator) is a publicly ac...
Multiple Methods for Multi-class, Multi-label ICD-10 Coding of Multi-granularity, Multilingual Death Certificates
We present concept detection and normalization experiments on the French and English CLEF eHealth 2017 death certificate datasets. For this purpose, we start from our last published system, which relied upon dictionary projection and supervised multi-class, mono-label text classification using simple features. We extend this system in several dimensions with multi-label classification and new f...